Isarn Dialect Speech Synthesis using HMM with syllable-context features
Similar resources
Phone set selection for HMM-based dialect speech synthesis
This paper describes a method for selecting an appropriate phone set in dialect speech synthesis for a so far undescribed dialect by applying hidden Markov model (HMM) based training and clustering methods. In this pilot study we show how a phone set derived from the phonetic surface can be optimized given a small amount of dialect speech training data.
HMM-Based Emphatic Speech Synthesis Using Unsupervised Context Labeling
This paper describes an approach to HMM-based expressive speech synthesis which does not require any supervised labeling process for emphasis context. We use appealing-style speech whose sentences were taken from real domains. To reduce the cost for labeling speech data with an emphasis context for the model training, we propose an unsupervised labeling technique of the emphasis context based o...
Using Bayesian Networks to find relevant context features for HMM-based speech synthesis
Speech units are highly context-dependent, so taking contextual features into account is essential for speech modelling. Context is employed in HMM-based Text-to-Speech speech synthesis systems via context-dependent phone models. A very wide context is taken into account, represented by a large set of contextual factors. However, most of these factors probably have no significant influence on t...
Speech Recognition Using Advanced HMM 2 Features
HMM2 is a particular hidden Markov model where state emission probabilities of the temporal (primary) HMM are modeled through (secondary) state-dependent frequency-based HMMs [12]. As shown in [13], a secondary HMM can also be used to extract robust ASR features. Here, we further investigate this novel approach towards using a full HMM2 as feature extractor, working in the spectral domain, and ...
Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding
In this paper, we propose a solution to reconstruct stress and accent contextual factors at the receiver of a very low bitrate speech codec built on recognition/synthesis architecture. In speech synthesis, accent and stress symbols are predicted from the text, which is not available at the receiver side of the speech codec. Therefore, speech signal-based symbols, generated as syllable-level log...
Journal
Journal title: ECTI Transactions on Computer and Information Technology (ECTI-CIT)
Year: 2018
ISSN: 2286-9131
DOI: 10.37936/ecti-cit.2018122.108607